November, 2020

Introduction

  • A common neuroscience topic is to detect the temporal order of two stimuli
  • These studies are often interested in making inferences at the group level and at an individual level
  • Multilevel models allow for simultaneous estimation of population effects and group level effects
  • Stan allows for flexible Bayesian modeling

Multilevel models

  • In classical regression, a simple single-level slope-intercept model can be specified as
    • \(y_i = \alpha + \beta x_i + \epsilon_i\)
  • If there is a categorical variable, then a varying-slopes varying-intercepts model can be specified as
    • \(y_i = \alpha_{j[i]} + \beta_{j[i]} x_i + \epsilon_i\)
  • In a multilevel model, the coefficients are modeled by a separate regression
    • \[\begin{equation*}\begin{split} y_i &= \alpha_{j[i]} + \beta_{j[i]} x_i \\ \alpha_j &\sim \mathcal{N}(a_0 + a_1 u_j, \sigma_{\alpha}^2) \\ \beta_j &\sim \mathcal{N}(b_0 + b_1 u_j, \sigma_{\beta}^2) \\ \end{split}\end{equation*}\]

Bayesian statistics

The goal is to update our current state of information (the prior) with the incoming data (given its likelihood) to receive an entire probability distribution reflecting our new beliefs (the posterior), with all modeling assumptions made explicit

\[ P(\theta | data) = \frac{P(data | \theta) P(\theta)}{\int_\Omega P(data | \theta) P(\theta) d\theta} \]

Methods

Hamiltonian Monte Carlo

Model Checking

  • Methods available to all MCMC samplers
    • Trace plots
    • Split \(\hat{R}\)
    • Effective sample size
  • Divergent transitions
    • When the simulated trajectory is far from the true trajectory as measured by the Hamiltonian
    • Specific to HMC

Estimating predictive performance

  • Predictive performance is a reliable metric for model comparison
  • A model that can predict well likely captures the regular features of the observed data and the data generating model
  • Can be estimated through leave-one-out cross-validation (LOOCV)
  • LOOCV is expensive, but can be estimated through Pareto-smoothed importance sampling (PSIS)

Non-centered parameterization

\[ \begin{equation} \begin{split} X &\sim \mathcal{N}(\mu, \sigma^2) \end{split} \quad \Longrightarrow \quad \begin{split} Z &\sim \mathcal{N}(0, 1^2) \\ X &= \mu + \sigma \cdot Z \end{split} \end{equation} \]

\[ \begin{equation} \begin{split} X &\sim \mathrm{Lognormal}(\mu, \sigma^2) \end{split} \quad \Longrightarrow \quad \begin{split} Z &\sim \mathcal{N}(0, 1^2) \\ X &= \exp\left(\mu + \sigma \cdot Z\right) \end{split} \end{equation} \]

\[ \begin{equation} \begin{split} X &\sim \mathrm{Cauchy}(\mu, \tau) \end{split} \quad \Longrightarrow \quad \begin{split} U &\sim \mathcal{U}\left(-\frac{\pi}{2}, \frac{\pi}{2}\right) \\ X &= \mu + \tau \cdot \tan(U) \end{split} \end{equation} \]

Background to Pyschometrics

Short history

  • Charles Darwin developed the idea that living organisms adapt in order to better survive in their environment
  • Sir Francis Galton became interested in the differences in human beings and in how to measure those differences
  • Johann Friedrich Herbart was interested in studying consciousness through the scientific method, and is responsible for creating mathematical models of the mind
  • E.H. Weber sought out to prove the idea of a psychological threshold
    • A minimum stimulus intensity necessary to activate a sensory system
  • G.T. Fechner devised Weber-Fechner Law that states that the strength of a sensation grows as the logarithm of the stimulus intensity

Psychometric experiments

  • If there are two distinct stimuli occurring nearly simultaneously then our brains will bind them into a single percept
  • The temporal delay between stimuli is called the stimulus onset asynchrony (SOA)
  • The range of SOAs for which sensory signals are integrated into a global percept is called the temporal binding window
  • When the SOA grows large enough, the brain segregates the two signals and the temporal order can be determined

Quantities of interest

  • Perceptual Synchrony
    • The temporal delay between two signals at which an observer is unsure about their temporal order
    • Studied through the point of subjective simultaneity (PSS)
  • Temporal Sensitivity
    • The ability to discriminate the timing of multiple sensory signals is referred to as temporal sensitivity
    • Studied through the just noticeable difference (JND)
    • The smallest lapse in time so that a temporal order can just be determined.

Chart of the psychometric function

Psychometric research interests

  • How age informs perceptual synchrony and temporal sensitivity
  • How temporal recalibration affects perceptual synchrony and temporal sensitivity
  • How age affects temporal recalibration

Four temporal order judgment tasks

  • Audiovisual - between an audible tone and a circle flashing on a screen
    • One subject’s post-adaptation block removed from the data: O-f-CE
  • Visual - between two circles flashing on a screen
  • Duration - the time duration between two circles displayed on a screen
  • Sensorimotor - between the press of a button and a circle flashed on a screen

Glimpse at the data

#> # A tibble: 2,772 x 7
#>      soa     k     n sid     task        trial age_group  
#>    <int> <int> <int> <fct>   <fct>       <fct> <fct>      
#>  1   250     3     3 O-m-DC  audiovisual post2 older_adult
#>  2   450     3     3 O-m-BC  audiovisual post1 older_adult
#>  3   -50     0     3 Y-m-DTF audiovisual post2 young_adult
#>  4  -150     0     3 M-f-JM  audiovisual post1 middle_age 
#>  5    50     1     3 Y-f-CJ  audiovisual post1 young_adult
#>  6  -350     0     5 O-m-BC  audiovisual pre   older_adult
#>  7  -500     0     5 M-m-BT  audiovisual pre   middle_age 
#>  8  -500     0     3 Y-f-MW  audiovisual post1 young_adult
#>  9  -300     0     3 Y-f-CM  audiovisual post1 young_adult
#> 10   100     3     3 M-m-WW  audiovisual post1 middle_age 
#> # … with 2,762 more rows

Developing a Model

for the Psychometric Function

Modeling psychometric quantities

  • Real-world properties inform modeling choices
    • PSS and JND quantities are estimated from the psychometric function
    • We have prior information about PSS and JND
  • We have control over:
    • Choice of psychometric function
    • Choice of linear parameterization
    • Choice of priors

Choice of psychometric function

  • Logistic and Gaussian have no skewness
  • Gaussian puts predictor on standard (probit) scale
  • Logistic puts predictor on log-odds (logit) scale
  • Logistic is the “canonical” choice for Binomial regression

Logistic function

\[ F(\theta) = \frac{1}{1 + \exp(-\theta)} \]

\[ F^{-1}(\pi) = \mathrm{logit}(\pi) = \ln\left(\frac{\pi}{1-\pi}\right) \]

The “link” is the logit function

\[ \pi = F(\theta) \Longleftrightarrow F^{-1}(\pi) = F^{-1}(F(\theta)) = \theta \]

Choice of linear parameterization

The PSS and JND are defined as

\[ \begin{align*} pss &:= F^{-1}(0.5; \alpha, \beta) \\ jnd &:= F^{-1}(0.84; \alpha, \beta) - F^{-1}(0.5; \alpha, \beta) \end{align*} \]

“Slope-intercept” linear parameterization

\[ \begin{equation*} \theta = \alpha + \beta x \Longrightarrow \begin{split} pss &= -\frac{\alpha}{\beta} \\ jnd &= \frac{\mathrm{logit}(0.84)}{\beta} \end{split} \end{equation*} \]

“Slope-location” linear parameterization

\[ \begin{equation*} \theta = \beta (x - \alpha) \Longrightarrow \begin{split} pss &= \alpha \\ jnd &= \frac{\mathrm{logit}(0.84)}{\beta} \end{split} \end{equation*} \]

Conceptual analysis

  • Subjects are presented with two stimuli separated by some temporal delay
  • They are asked to respond as to their perception of the temporal order
  • As the SOA becomes larger in the positive direction, subjects are expected to give more “positive” responses
  • After the first experimental block the subjects go through a temporal recalibration period, and repeat the experiment

Observational space

  • The response that subjects give during a TOJ task is recorded as a zero or a one
    • \(y_i \in \lbrace 0, 1\rbrace\)
  • If the SOA values are fixed then we have repeated measurements for each SOA and the responses can be aggregated into Binomial counts
    • \(k_i, n_i \in \mathbb{Z}_0^+, k_i \le n_i\)
    • \(k_i \sim \mathrm{Binomial}(n_i, p_i)\)

Selecting priors - Point of subjective simultaneity

  • The point of subjective simultaneity can be either positive or negative
  • Some studies suggest that the separation between stimuli need to be as little as 20ms for subjects to be able to determine temporal order
  • Other studies suggest that our brains can detect temporal differences as small as 30ms
  • We should be skeptical of PSS estimates larger than say 150ms in absolute value
  • \[pss \sim \mathcal{N}(0, 0.06^2) \Longleftrightarrow \alpha \sim \mathcal{N}(0, 0.06^2)\]

Selecting priors - Just noticeable difference

  • It is impossible that a properly conducted block would result in a JND less than 0
  • It is also unlikely that the just noticeable difference would be more than a second
  • Some studies show that an input lag as small as 100ms can impair a person’s typing ability
  • Use Log-Normal distribution to model the JND
  • a priori set prior so that \(median[X] \approx 0.100\) and \(P(X < 1) \approx 0.99\)
    • \[jnd \sim \mathrm{Lognormal}(-2.3, 1^2)\]
    • \[\beta = \frac{\mathrm{logit}(0.84)}{jnd} \Longleftrightarrow \beta \sim \mathrm{Lognormal}(2.8, 1^2)\]

Baseline model

\[ \begin{equation*} \begin{split} k_i &\sim \mathrm{Binomial}(n_i, p_i) \\ \mathrm{logit}(p_i) &= \beta \cdot (x_i - \alpha) \\ \alpha &\sim \mathcal{N}(0, 0.06^2) \\ \beta &\sim \mathrm{Lognormal}(3, 1^2) \end{split} \end{equation*} \]

model {
  vector[N] p = beta * (x - alpha);
  alpha ~ normal(0, 0.06);
  beta ~ lognormal(3.0, 1.0);
  k ~ binomial_logit(n, p);
}

Prior predictive checks - psychometric function

Prior predictive checks - PSS and JND

Adding age and block

  • Age and block are categorical variables
  • They are necessary to answer the question of how age informs perceptual synchrony and temporal sensitivity

Three ways to add a categorical variable

\[ \begin{equation} \begin{split} \mu_\alpha &= \alpha_{trt[i]} \\ \alpha_{trt} &\sim \mathcal{N}(0, 0.06^2) \end{split} \qquad \begin{split} \mu_\alpha &= \alpha_{trt[i]} \\ \alpha &\sim \mathcal{N}(0, 0.06^2) \\ \alpha_{trt} &\sim \mathcal{N}(\alpha, \sigma_{trt}^2) \\ \sigma_{trt} &\sim \pi_{\sigma} \end{split} \qquad \begin{split} \mu_\alpha &= \alpha + \alpha_{trt[i]} \\ \alpha &\sim \mathcal{N}(0, 0.06^2) \\ \alpha_{trt} &\sim \mathcal{N}(0, \sigma_{trt}^2) \\ \sigma_{trt} &\sim \pi_{\sigma} \end{split} \end{equation} \]

  • (Left) Categorical variable replaces the intercept so each level explicitly gets its own intercept
  • (Center) Each categorical variable is modeled as having the same mean and standard deviation that is learned from the data (centered multilevel)
  • (Right) Same as center, but using the non-centered parameterization
    • Most convenient when adding multiple categorical variables

Using the non-centered log-normal parameterization

\[ \begin{align*} X &\sim \mathcal{N}(3, 1^2) \\ Y &= \exp\left(X\right) \Longleftrightarrow Y \sim \mathrm{Lognormal(3, 1^2)} \end{align*} \]

Adding categorical variables to the slope term is more tractable

\[ \begin{equation} \begin{split} k_i &\sim \mathrm{Binomial}(n_i, p_i) \\ \mathrm{logit}(p_i) &= \exp(\mu_\beta) (x_i - \mu_\alpha) \\ \mu_\alpha &= \alpha + \alpha_{trt[i]} + \alpha_{G[i]} \\ \mu_\beta &= \beta + \beta_{trt[i]} + \beta_{G[i]} \\ \end{split} \end{equation} \]

Model 2 - Checking the posterior

Model 2 - Posterior predictive distribution

Modeling interactions

  • There is a problem with the previous model
    • It measures the average difference in blocks, and the average difference in age groups
    • Does not consider any interaction between the two
    • Model assumes that temporal recalibration affects all age groups the same
  • Modeling interactions between categorical variables by creating a new categorical variable
    • Cross-product between factor levels
    • If \(A\) has levels \(a, b, c\) and \(B\) has levels \(x, y\), then the interaction variable \(C=A\times B\) will have levels \(ax, ay, bx, by, cx, cy\)

Model 3 - Posterior predictive distribution

Incorporating a lapse rate

  • A lapse in judgment can happen for any reason, and is assumed to be random and independent of other lapses
  • Lapses can be modeled as occurring independently at some fixed rate
  • The underlying psychometric function, \(F\), is bounded by some lower and upper lapse rate
  • For a given lower and upper lapse rate \(\lambda\) and \(\gamma\), the performance function \(\Psi\) is
    • \[\Psi(x; \alpha, \beta, \lambda, \gamma) = \lambda + (1 - \lambda - \gamma) F(x; \alpha, \beta)\]

Chart of a lapse rate model

Modeling the lapse rate

\[ \begin{equation} \begin{split} k_i &\sim \mathrm{Binomial}(n_i, p_i) \\ p_i &= \lambda + (1 - 2\lambda)\cdot F\left(\exp(\mu_\beta) (x_i - \mu_\alpha)\right) \\ \mu_\alpha &= \alpha + \alpha_{G[i] \times trt[i]} \\ \mu_\beta &= \beta + \beta_{G[i] \times trt[i]} \\ \end{split} \end{equation} \]

model {
  vector[N] p;
  for (i in 1:N) {
    real mu_b = b + bGT[G[i], trt[i]];
    real mu_a = a + aGT[G[i], trt[i]];
    p[i] = lG[G[i]] + (1 - 2*lG[G[i]]) * inv_logit(exp(mu_b) * (x[i] - mu_a));
  }
  k ~ binomial(n, p);
}

Result of modeling a lapse rate

Including a lapse rate improves predictive performance

l043 <- loo(readRDS("../models/m043vis.rds"))
l044 <- loo(readRDS("../models/m044vis.rds"))
comp_vis <- loo_compare(l043, l044)
#> # A tibble: 2 x 6
#>   Model    elpd_diff  se_diff    elpd_loo   p_loo      se_p_loo  
#>   <fct>    <compar.l> <compar.l> <compar.l> <compar.l> <compar.l>
#> 1 Lapse       0.0000   0.00000   -1001.108  19.21726   1.901839  
#> 2 No Lapse -259.3614  31.92362   -1260.469  23.09895   2.259036

Adding subjects

  • Some researchers are interested in making inferences at an individual level
  • Modeling at the subject level can improve

Final model

\[ \begin{equation*} \begin{split} k_i &\sim \mathrm{Binomial}(n_i, p_i) \\ p_i &= \lambda_{G[i]} + (1 - 2\lambda_{G[i]})\exp(\mu_\beta) (x_i - \mu_\alpha) \\ \mu_\alpha &= \alpha + \alpha_{G[i], trt[i]} + \alpha_{S[i]} \\ \mu_\beta &= \beta + \beta_{G[i], trt[i]} + \beta_{S[i]} \\ \lambda_{G} &\sim \mathrm{Beta}(4, 96) \\ \hat{\alpha}, \hat{\alpha}_{G\times trt}, \hat{\alpha}_{S} &\sim \mathcal{N}(0, 1^2) \\ \hat{\beta}, \hat{\beta}_{G\times trt}, \hat{\beta}_{S} &\sim \mathcal{N}(0, 1^2) \\ \hat{\sigma}_{G\times trt}, \hat{\gamma}_{G\times trt}, \hat{\tau}_{S}, \hat{\nu}_{S} &\sim \mathcal{U}(0, \pi/2) \\ \end{split} \qquad \begin{split} \alpha &= 0.06 \cdot \hat{\alpha} \\ \alpha_{G\times trt} &= \sigma_{G\times trt} \cdot \hat{\alpha}_{trt} \\ \alpha_{S} &= \tau_{S} \cdot \hat{\alpha}_{S} \\ \sigma_{G \times trt} &= \tan(\hat{\sigma}_{G \times trt}) \\ \tau_{S} &= \tan(\hat{\tau}_{S}) \\ \beta &= 3 + 1 \cdot \hat{\beta} \\ \beta_{G\times trt} &= \gamma_{G\times trt} \cdot \hat{\beta}_{G\times trt} \\ \beta_{S} &= \nu_{S} \cdot \hat{\beta}_{S} \\ \gamma_{G\times trt} &= \tan(\hat{\gamma}_{G\times trt}) \\ \nu_{S} &= \tan(\hat{\nu}_{S}) \end{split} \end{equation*} \]

Results

On perceptual synchrony (PSS)

  • Temporal recalibration has a small-yet-consistent affect on audio-visual TOJ
  • The visual TOJ task shows some evidence for differences between age groups
  • Temporal recalibration has an affect for the duration TOJ task

PSS - Audiovisual

PSS - Visual

PSS - Duration

PSS - Sensorimotor

On temporal sensitivity

  • Temporal recalibration increases temporal sensitivity in the audiovisual and visual TOJ tasks
  • The older age group generally lags behind the younger and middle age groups
  • There are no visually significant changes in the duration TOJ task
  • The younger age group saw heighten temporal sensitivity in the sensorimotor TOJ task

JND - Audiovisual

JND - Visual

JND - Duration

JND - Sensorimotor

Lapses

Subject level inferences

  • Modeling subjects allows for comparison between subjects
  • Highlights the variation within groups
  • Can be used to make predictions for subjects that only partially participated in a task
    • Subject O-f-CE completed the pre-adaptation block, but was removed from the post-adaptation block

Subject-specific examples

Discussion

Improving psychometric experiments

  • The results here are based on small-scale preliminary data
  • There is an inherent difficulty for the different tasks
    • Visual -> Audiovisual -> Duration -> Sensorimotor
    • Indicated by temporal sensitivity and lapse rates
  • Lapses should be independent of the task
    • Distribution of lapse rates across ages and tasks shows this assumption is violated
    • Better estimation of lapse rates by presenting larger SOAs in the experiment
  • In the visual TOJ task, the granularity in the SOA values near the PSS could be increased to get more reliable estimates of the slope and to avoid complete separation

Future work

  • Fitting a multitask model
    • Borrowing information across different groups improves parameter estimation
  • Explore a causal inference model for the psychometric function
    • Current model only measures association between temporal recalibration and the PSS/JND
  • Improve estimation of predictive performance by considering leave-one-group-out cross-validation

Conclusion

  • We have

Thank you